Search Results: "Sam Hartman"

12 August 2017

Sam Hartman: Debian: a Commons of Innovation

I recently returned from Debconf. This year at Debconf, Matthew Garrett gave a talk about the next twenty years in free software. In his talk he raised concerns that Debian might not be relevant in that ecosystem and talked about some of the trends that contribute to his concerns.
I was talking to Marga after the talk and she said that Debian used to be a lot more innovative than it is today.
My initial reaction was doubt; what she said didn't feel right to me. At the time I didn't have a good answer. Since then I've been pondering the issue, and I think I have a partial answer to both Marga and Matthew and so I'll share it here.
In the beginning Debian focused on a lot of technical innovations related to bringing an operating system together. We didn't understand how to approach builds and build dependencies in a uniform manner. Producing packages in a clean environment was hard. We didn't understand what we wanted out of packages in terms of a uniform approach to configuration handling and upgrades. To a large extent we've solved those problems.
However, as the community has grown, our interests are more diverse. Our users and free software (and the operating system we build together) are what bring us together: we still have a central focus. However, no one technical project captures us all. There's still significant technical innovation in the Debian ecosystem. That innovation happens in Debian teams, companies and organizations that interact with the Debian community. We saw several talks about such innovation this year. I found the talk about ostree and flatpak interesting, especially because it focused on people in the broader Debian ecosystem valuing Debian along with some of the same technologies that Matthew is worried will undermine our relevance.
Matthew talked about how Debian ends up being a man-in-the-middle. We're between users and developers. We're between distributions and upstreams. Users are frustrated because we hold back the latest version of the software they want from reaching them. Developers are frustrated because we present our users with old versions of their software configured not as they would like, combined with different dependencies than they expect.
All these weaknesses are real.
However, I think Debian-in-the-middle is our greatest strength both on the technical and social front.
I value Debian because I get a relatively uniform interface to the software I use. I can take one approach to reporting bugs whether they are upstream or Debian-specific. I expect the software to behave in uniform ways with regard to things covered by policy. I know that I'm not going to have to configure multiple different versions of core dependencies; for the most part system services are shared. When Debian has value, it's because our users want those things we provide. Debian has also reviewed every source file in the software we ship to understand the license and license compatibility. As a free software supporter and as someone who consumes software in a commercial context, that value alone is enormous.
The world has evolved and we're facing technologies that provide different models. They've been coming for years: Python, Ruby, Java, Perl and others have been putting together their own commons of software. They have all been working to provide virtualization to isolate one program (and its dependencies) from another. Containerization takes that to the next level. Sometimes that's what our users want.
We haven't figured out what the balances are, how we fit into this new world. However, I disagree with the claim that we aren't even discussing the problem. I think we're trying a lot of things off in our own little technical groups. We're just getting to the level of critical mass of understanding where we can take advantage of Debian's modern form of innovation.
Because here's the thing. Debian's innovation now is social, not technical. Just as we're in the middle technically, we're in the middle socially. Upstreams, developers, users, derivatives, and all the other members of our community work together. We're a place where we can share technology, explore solutions, and pull apart common elements. This is the first Debconf where it felt like we'd explored some of these trends enough to start understanding how they might fit together as a whole. Seven years ago, it felt like we were busy being convinced the Java folks were wrong-headed. A couple of years later, it felt like we were starting to understand our users' desires that led to models different from packaging, but we didn't yet have thoughts on how to respond. At least in my part of the hallway it sounded like people were starting to think about how they might fit parts together and what the tradeoffs would be.
Yes, Matthew's talk doubtless sparked some of that. I think he gave us a well deserved and important wake-up call. However, I was excited by the discussion prior to Thursday.
What I'm taking away is that Debian is valuable when there's a role in the middle. Both socially and technically, we should capitalize on situations where something in the middle makes things better, and get out of the way where it does not.

9 April 2017

Sam Hartman: When "when" is too hard a question: SQLAlchemy, Python datetime, and ISO8601

A new programmer asked on a work chat room how timezones are handled in databases. He asked if it was a good idea to store things in UTC. The senior programmers all laughed as we told some of our horror stories with timezones. Yes, UTC is great; if only it were that simple.
About a week later I was designing the schema for a blue sky project I'm implementing. I had to confront time in all its Pythonic horror.
Let's start with the datetime.datetime class. Datetime objects optionally include a timezone. If no timezone is present, several methods such as timestamp treat the object as a local time in the system's timezone. The timestamp method returns a POSIX timestamp, which is always expressed in UTC, so knowing the input timezone is important. The now method constructs such an object from the current time.
However, other methods act differently. The utcnow method constructs a datetime object that has the UTC time but is not marked with a timezone. So, for example, datetime.fromtimestamp(datetime.utcnow().timestamp()) produces the wrong result unless your system timezone happens to have the same offset as UTC.
It's also possible to construct a datetime object that includes a UTC time and is marked as having a UTC time. The utcnow method never does this, but you can pass the UTC timezone into the now method and get that effect. As you'd expect, the timestamp method returns the correct result on such a datetime.
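A minimal illustration of the difference, using only the standard library (the variable names are mine):

from datetime import datetime, timezone

# utcnow() returns the current UTC wall-clock time but attaches no tzinfo,
# so timestamp() later reinterprets it as local time.
naive_utc = datetime.utcnow()

# now(timezone.utc) attaches tzinfo, so timestamp() does the right thing.
aware_utc = datetime.now(timezone.utc)

# Round-tripping the naive value through a POSIX timestamp goes wrong
# unless the system timezone happens to match UTC.
print(datetime.fromtimestamp(naive_utc.timestamp()))  # shifted by the local UTC offset
print(datetime.fromtimestamp(aware_utc.timestamp()))  # correct local time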
Now enter SQLAlchemy, one of the more popular Python ORMs. Its DATETIME type has an argument that tries to request a column capable of storing a timezone from the underlying database. You aren't guaranteed to get this though; some databases don't provide that functionality. With PostgreSQL, I do get such a column, although something in SQLAlchemy is not preserving the timezones (although it is correctly adjusting the time). That is, I'll store a UTC time in an object, flush it to my session, and then read back the same time represented in my local timezone (marked as my local timezone). You'd think this would be safe.
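For reference, a mapped class with such a column might be declared roughly like this; the class and table names are illustrative rather than from my actual schema, but date_col matches the snippet below:

from sqlalchemy import Column, DateTime, Integer
from sqlalchemy.ext.declarative import declarative_base

Base = declarative_base()

class Record(Base):
    __tablename__ = 'records'
    id = Column(Integer, primary_key=True)
    # timezone=True asks the backend for a timezone-aware column;
    # PostgreSQL honors the request, SQLite does not.
    date_col = Column(DateTime(timezone=True))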
Enter SQLite. SQLite makes life hard for people wanting to store time; it seems to want to store things as strings. That's fairly incompatible with storing a timezone and doing any sort of comparisons on dates. SQLAlchemy does not try to store a timezone in SQLite. It just trims any timezone information from the datetime. So, if I do something like
from datetime import datetime, timezone

# obj is an instance of a mapped class with a timezone-aware DateTime column
# named date_col; session is a SQLAlchemy session backed by SQLite.
d = datetime.now(timezone.utc)
obj.date_col = d
session.add(obj)
session.flush()
assert obj.date_col == d                                # fails
assert obj.date_col.timestamp() == d.timestamp()        # fails
assert d == obj.date_col.replace(tzinfo=timezone.utc)   # finally succeeds

There are some unfortunate consequences of this. If you mark your datetimes with timezone information (even if it is always the same timezone), whether two datetimes representing the same datetime compare equal depends on whether objects have been flushed to the session yet. If you don't mark your objects with timezones, then you may not store timezone information on other databases.
At least if you use only the methods we've discussed so far, you're reasonably safe if you use local time everywhere in your application and don't mark your datetimes with timezones. That's undesirable because as our new programmer correctly surmised, you really should be using UTC. This is particularly true if users of your database might span multiple timezones.
You can use UTC time and not mark your objects as UTC. This will give the wrong data with a database that actually does support timezones, but will sort of work with SQLite. You need to be careful never to convert your datetime objects into POSIX time as you'll get the wrong result.
It turns out that my life was even more complicated because parts of my project serialize data into JSON. For that serialization, I've chosen ISO 8601. You've probably seen that format: '2017-04-09T18:17:27.340410+00:00'. Datetime provides the convenient isoformat method to print timestamps in the ISO 8601 format. If the datetime has a timezone indication, it is included in the ISO formatted string. If not, then no timezone indication is included. You know how I mentioned that datetime treats a time without a timezone marker as local time? Yeah, well, that's not what 8601 does: UTC all the way, baby! And at least the parser in the iso8601 module will always include timezone markers. So, if you use datetime to print a timestamp without a timezone marker and then read that back in to construct a new datetime on the deserialization side, then you'll get the wrong time. OK, so mark things with timezones then. Well, if you use local time, then the time you get depends on whether you print the ISO string before or after session flush (before or after SQLAlchemy trims the timezone information as it goes to SQLite).
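As a small sketch of that round-trip hazard, using the third-party iso8601 parser mentioned above (variable names are mine):

from datetime import datetime
import iso8601

local_naive = datetime.now()      # local wall-clock time, no tzinfo
s = local_naive.isoformat()       # no offset appears in the string

# iso8601.parse_date assumes UTC when the string carries no offset, so
# unless the local timezone is UTC the parsed value names a different
# instant than the one that was serialized.
parsed = iso8601.parse_date(s)
print(s, parsed)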
It turns out that I had the additional complication of one side of my application using SQLite and one side using PostgreSQL. Remember how I mentioned that something between SQLAlchemy and PostgreSQL was recasting my times in the local timezone (although keeping the time the same)? Well, consider how that's going to work. I serialize with the timezone marker on the PostgreSQL side. I get an ISO 8601 local time marked with the correct timezone marker. I deserialize on the SQLite side. Before session flush, I get a local time marked as local time. After session flush, I get a local time with no marking. That's bad. If I further serialize on the SQLite side, I'll get that local time incorrectly marked as UTC. Moreover, all the times being locally generated on the SQLite side are UTC, and as we explored, SQLite really only wants one timezone in play.
I eventually came up with the following approach:

  1. If I find myself manipulating a time without a timezone marking, assert that its timezone is UTC not localtime.

  2. Always use UTC for times coming into the system.

  3. If I'm generating an ISO 8601 time from a datetime that has a timezone marker in a timezone other than UTC, represent that time as a UTC-marked datetime, adjusting the time for the change in timezone (sketched below).
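A minimal sketch of what these rules might look like as code; the helper names are illustrative, not anything from my project:

from datetime import datetime, timezone

def to_utc(dt):
    """Return a UTC-marked datetime, following the rules above."""
    if dt.tzinfo is None:
        # Rule 1: unmarked times are only ever UTC in this code base,
        # so attach the UTC marker rather than guessing local time.
        return dt.replace(tzinfo=timezone.utc)
    # Rule 3: anything marked with another zone is converted to UTC.
    return dt.astimezone(timezone.utc)

def to_iso8601(dt):
    # Serialization always emits a UTC-marked ISO 8601 string.
    return to_utc(dt).isoformat()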


This is way too complicated. I think that both datetime and SQLAlchemy's SQLite time handling have a lot to answer for. I think SQLAlchemy's core time handling may also have some to answer for, but I'm less sure of that.

29 January 2017

Sam Hartman: Network Audio Visualization: Network Modeling

Previously, I wrote about my project to create an audio depiction of network traffic. In this second post, I explore how I model aspects of the network that will be captured in the audio representation. Before getting started, I'll pass along a link. This is not the first time someone has tried to put sound to packets flying through the ether: I was pointed at Peep. I haven't looked at Peep, but will do so after I finish my own write-up. Not being an academic, I feel no obligation to compare and contrast my work to others. :-)
I started with an idea of what I'd like to hear. One of my motivations was to explore some automated updates we run at work. So, I was hoping to capture the initial DNS and ARP traffic as the update discovered the systems it would contact. Then I was hoping to capture the ssh and other traffic of the actual update.
To Packet or Stream
One of the simplest approaches would be to model individual network packets. For DNS I chose that approach. I was dubious that a packet-based model would capture the aspects of TCP streams I typically care about. I care about the source and destination (both address and port), of course. However, I also care about how much traffic is being carried over the stream and the condition of the stream. Are there retransmits? Are there a bunch of unanswered SYNs? But I don't care about the actual distribution of packets. Also, a busy TCP stream can generate thousands of packets a second. I doubted my ability to distinguish thousands of sounds a second at all, especially while trying to convey enough information to carry stream characteristics like overall traffic volume.
So, for TCP, I decided to model some characteristics of streams rather than individual packets.
For DNS, I decided to represent individual requests/replies.
I came up with something clever for ARP. There, I model the request/reply as an outstanding request. A lot of unanswered ARPs can be a sign of a scan or a significant problem. The mournful sound of a TCP stream trailing off into an unanswered ARP as the cache times out on a broken network is certainly something I'd like to capture. So, I track when an ARP request is sent and when/if it is answered.
Sound or Music
I saw two approaches. First, I could use some sound to represent streams. As an example, a running diesel engine could make a great representation of a stream. The engine speed could represent overall traffic flow. There are many opportunities for detuning the engine to represent various problems that can happen with a stream. Perhaps using stereo separation and slightly different fundamental frequencies I could even represent a couple of streams and still be able to track them.
However, at least with me as a listener, that's not going to scale to a busy network. The other option I saw was to try and create melodic music with various musical phrases modified as conditions within the stream or network changed. That seemed a lot harder to do, but humans are good at listening to complicated music.
I ended up deciding that at least for the TCP streams, I was going to try and produce something more musical than sound. I was nervous: I kept having visions of a performance of "Peter and the Wolf" with different instruments representing all the characters that somehow went dreadfully wrong.
As an aside, the decision to approach music rather than sound depended heavily on what I was trying to capture. If I'm modeling more holistic properties of a system--for example, total network traffic without splitting into streams--I think parameterized sounds would be a better approach.
The decision to approach things musically affected the rest of the modeling. Somehow I was going to need to figure out notes to play. I'd already rejected the idea of modeling packets, so I wouldn't simply be able to play notes when a packet arrived.
Energy Decay
As I played with various options, I realized that the critical challenge would be figuring out how to focus the listener's attention on the important aspects of what was going on. Clutter was the great enemy. My job would be figuring out how to spend sound wisely. When something interesting happened, that part of the model should get more focus--more of the listener's energy.
Soon I found myself thinking a lot about managing the energy of network streams. I imagined streams getting energy when something happened, and spending that energy to convey that interesting event to the listener. Energy needed to accumulate fast enough that even low-traffic streams could be noticed. Energy needed to be spent fast enough that old events were not taking listener focus from new, interesting things going on. However, if the energy were spent slow enough, then network events could be smoothed out to give a better picture of the stream rather than individual packets.
This concept of managing some decaying quantity and managing the rate of decay proved useful at multiple levels of the model.
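To make the idea concrete, here is a sketch of a per-stream energy value with exponential decay; this is an illustration written for this post, not the project's actual code:

import math

class StreamEnergy:
    def __init__(self, half_life=2.0):
        self.energy = 0.0           # how much listener attention this stream deserves
        self.last_update = 0.0
        self.half_life = half_life  # seconds for accumulated energy to decay by half

    def add(self, amount, now):
        """Credit energy for a network event (e.g. bytes carried in a batch)."""
        self._decay(now)
        self.energy += amount

    def current(self, now):
        self._decay(now)
        return self.energy

    def _decay(self, now):
        # Exponential decay: old events fade so that new activity stands out.
        elapsed = now - self.last_update
        self.energy *= math.exp(-math.log(2) * elapsed / self.half_life)
        self.last_update = now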
Two Layer Model
I started with a python script that parses tcpdump output. It associates a packet with a stream and batches packets together to avoid overloading other parts of the system.
The output of this script is a series of stream events. Events include a source and destination address, a stream ID, traffic in each direction, and any special events on the stream.
For DNS, the script just outputs packet events. For ARP, the script outputs request start, reply, and timeout events. There's some initial support for UDP, but so far that doesn't make sound.
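As a rough sketch of the kind of event records the script might emit (the field names here are illustrative guesses, not the actual output format):

def make_stream_event(stream_id, src, dst, bytes_to_dst, bytes_to_src, special=None):
    # One batched observation of a stream, as described above.
    return {
        'stream_id': stream_id,   # which stream this batch belongs to
        'src': src,               # source address (and port for TCP)
        'dst': dst,               # destination address (and port for TCP)
        'to_dst': bytes_to_dst,   # traffic in each direction since the last event
        'to_src': bytes_to_src,
        'special': special,       # e.g. 'fin', 'dns-query', 'arp-timeout'
    }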
Right now, FINs are modeled, but SYNs and the interesting TCP conditions aren't directly modeled. If you get retransmissions you'll notice because packet flow will decrease. However, I'd love to explicitly sound retransmissions. I also think a window filling as an application fails to read is important. I imagine either narrowing a band-pass filter to clamp the audio bandwidth available to a stream with a full window, or perhaps taking it the other direction and adding an echo.
The next layer down tracks the energy of each stream. But that, and how I map energy into music, is the topic of the next post.

23 January 2017

Sam Hartman: Kudos to GDB Contributors

I recently diagnosed a problem in Debian's pam-p11 package. This package allegedly permits logging into a computer using a smart card or USB security token containing an ssh key. If you know the PIN and have the token, then your login attempt is authorized against the ssh authorized keys file. This seems like a great way to permit console logins as root to machines without having a shared password. Unfortunately, the package didn't work very well for me. It worked once, then all future attempts to use it segfaulted.

I'm familiar with how PAM works. I understand the basic ideas behind PKCS11 (the API used for this type of smart card), but was completely unfamiliar with this particular PAM module and the PKCS11 library it used. The segfault was in an area of code that I didn't even expect this PAM module would ever call. Back in 1994, that would have been a painful slog. GDB has improved significantly since then, and I'd really like to thank all the people over the years who made that possible. I was able to isolate the problem in just a couple of hours of debugging. Here are some of the cool features I used:

Anyway, with just a couple of hours and no instrumentation of the code, I managed to track down how a bunch of structures were being freed as an unexpected side effect of one of the function calls. Neither I nor the author of the pam-p11 module expected that (although it is documented and does make sense in retrospect). Good tools make life easier.

15 January 2017

Sam Hartman: Musical Visualization of Network Traffic

I've been working on a fun holiday project in my spare time lately. It all started innocently enough. The office construction was nearing its end, and it was time for my workspace to be set up. Our deployment wizard and I were discussing. Normally we stick two high-end monitors on a desk. I'm blind; that seemed silly. He wanted to do something similarly nice for me, so he replaced one of the monitors with excellent speakers. They are a joy to listen to, but I felt like I should actually do something with them. So, I wanted to play around with some sort of audio project.
I decided to take a crack at an audio representation of network traffic. The Solaris version of ping used to have an audio option, which would produce sound for successful pings. In the past I've used audio cues to monitor events like service health and build status.
It seemed like you could produce audio to give an overall feel for what was happening on the network. I was imagining a quick listen would be able to answer questions like:

  1. How busy is the network?

  2. How many sources are active?

  3. Is the traffic a lot of streams or just a few?

  4. Are there any interesting events such as packet loss or congestion collapse going on?

  5. What's the mix of services involved?


I divided the project into three segments, which I will write about in future entries:

I'm fairly happy with what I have. It doesn't represent all the items above. As an example, it doesn't directly track packet loss or retransmissions, nor does it directly distinguish one service from another. Still, just because of the traffic flow, rsync sounds different from http. It models enough of what I'm looking for that I find it to be a useful tool. And I learned a lot about music and Linux audio. I also got to practice designing discrete-time control functions in ways that brought back the halls of MIT.

1 January 2017

Sam Hartman: 2016

I was in such a different place at the beginning of 2016: I was poised to continue to work to help the world find love. Professionally, I was ready to make a much needed transition and find new projects to work on. The year 2016 sucked. It feels like the year was filled with many different versions of the universe saying "Not interested in what you have to offer." At the beginning of the year, I had the energy to try and reach across large disagreements and help find common ground even when compromise was not possible. Now, my blog lies fallow because I cannot find the strength to be vulnerable enough to write what I would choose to say. Certainly a lot of the global changes of the last year have felt like a strong rejection of the world I'd like to see. However, many of the rejections have been personal. Beyond that, most of the people who stood as pillars of support in my life, together helping me find the strength to be vulnerable, are no longer available. When the universe sends such strong messages, it's a good idea to ask whether you are on the right path. I certainly have discovered training I need and things I need to improve in order to avoid making costly mistakes that hurt others. However, among the rejections were clear demonstrations of the value of reaching out with love and compassion. Besides, this is what I'm called to do. It's what I want to do. I certainly will not force it on anyone. But it looks like the next few years may be a hard struggle to find pockets of people interested in that work, finding people who will choose love even in the current world, along with some difficult training to learn from challenges of the last year. Amongst all this, my life is filled with love. There are new connections even as old connections are strained. There is always the hope of finding new ways to connect when the old ones are no longer right. I will rebuild and regain safety. I have the tools to do that. The process is just long and complicated.

11 September 2016

Niels Thykier: Unseen changes to lintian.d.o

We have been making a lot of minor changes to lintian.d.o and the underlying report framework. Most of them were hardly noticeable to the naked eye. In fact, I probably would not have spotted any of them, if I had not been involved in writing them. Nonetheless, I felt like sharing them, so here goes.

User visible changes: In case you were wondering, the section title is partly a pun, as half of these changes were intended to assist visually impaired users. They were triggered by me running into Sam Hartman at DebConf16, where I asked him about how easy Debian's websites were for blind people. Allegedly, we are generally doing quite good in his opinion (with one exception, for which Sam filed Bug#830213), which was a positive surprise for me. On a related note: thanks to Luke Faraone and Asheesh Laroia for helping me get started on these changes.

Reporting framework / Internal changes: With the last change plus the "no generate reports" option, we were able to schedule lintian more frequently. Originally, lintian only ran once a day. With "no generate reports", we added a second run, and with the last changes, we bumped it to 4 times a day. Unsurprisingly, it means that we are now reprocessing the archive a lot faster than previously. All of the above is basically all the noteworthy changes on the Lintian reporting framework since the partial rewrite of lintian's reporting setup (~1 year ago).

10 November 2014

John Goerzen: Debian - A plea to worry about what matters, and not take ourselves too seriously

I posted this on debian-devel today. I am also posting it here, because I believe it is important to more than just Debian developers.

Good afternoon,

This message comes on the heels of Sam Hartman's wonderful plea for compassion [1] and the sad news of Joey Hess's resignation from Debian [2]. I no longer frequently post to this list, but when you've been a Debian developer for 18 years, and still care deeply about the community and the project, perhaps you have a bit of perspective to share.

Let me start with this: Debian is not a Free Software project. Debian is a making-the-world-better project, a caring for people project, a freedom-spreading project. Free Software is our tool. As many of you, hopefully all of you, I joined Debian because I enjoyed working on this project. We all did, didn't we? We joined Debian because it was fun, because we were passionate about it, because we wanted to make the world a better place and have fun doing it. In short, Debian is life-giving, both to its developers and its users.

As volunteers, it is healthy to step back every so often, and ask ourselves two questions: 1) Is this activity still life-giving for me? 2) Is it life-giving for others?

I have my opinions about init. Strong ones, in fact. [3] They're not terribly relevant to this post. Because I can see that they are not really all that relevant. 14 years ago, I proposed what was, until now anyhow, one of the most controversial GRs in Debian history. It didn't go the way I hoped. I cared about it deeply then, and still care about the principles. I had two choices: I could be angry and let that process ruin my enjoyment of Debian. Or I could let it pass, and continue to have fun working on a project that I love. I am glad I chose the latter. Remember, for today, one way or another, jessie will still boot.

18 years ago when I joined Debian, our major concerns were helping newbies figure out how to compile their kernels, finding manuals for monitors so we could set the X modelines properly, finding some sort of Free web browser, finding some acceptable Office-type software. Wow. We WON, didn't we? Not just Debian, but everyone. Freedom won.

I promise you 18 years from now, it will not matter what init Debian chose in 2014. It will probably barely matter in 3 years. This is not key to our goals of making the world a better place. Jessie will still boot. I say that even though my system runs out of memory every few days because systemd-logind has a mysterious bug [4]. It will be fixed. I say that even though I don't know what init system it will use, or how much choice there will be. I say that because it is simply true. We are Debian. We will make it work, one way or another.

I don't post much on this list anymore because my personal passion isn't with posting on this list anymore. I make liberal use of my Delete Thread keybinding on -vote these days, because although I care about the GR, I don't care about it enough to read all the messages about it. I have not yet decided if I will spend the time researching it in order to vote. Instead of debating the init GR, sometimes I sit on the sofa with my wife. Sometimes I go out and fly the remote-control airplane I'm learning to fly. Sometimes I repair my plane after a flight that was shorter than planned. Sometimes I play games with my boys, or help them with homework, or share my 8-year-old's delight as a text file full of facts about the Titanic that he wrote in Emacs comes spitting out of the printer.

Sometimes I write code or play with the latest Linux filesystems or build a new server for my basement. All these things matter more to me than init. I have been using Debian at home for almost 20 years, at various workplaces for almost that long, and it is not going to stop being a part of my life any time soon. Perhaps I will have to learn how to administer a new init system. Well, so be it; I enjoy learning new things. Or perhaps I will have to learn to live with some desktop limitations with an old init system. Well, so be it; it won't bother me much anyhow. Either way, I'm still going to be using what is, to me, the best operating system in the world, made by one of the world's foremost Freedom projects.

My hope is that all of you may also have the sense of peace I do, that you may have your strong convictions, but may put them all in perspective. That we as a project realize that the enemy isn't the lovers of the other init, but the people that would use law and technology to repress people all over the world. We are but one shining beacon on a hill, but the world will be worse off if our beacon winked out.

My plea is that we each may get angry at what matters, and let go of the smaller frustrations in life; that we may each find something more important than init/systemd to derive enjoyment and meaning from. [5] May you each find that airplane to soar freely in the skies, to lift your soul so that the joy of using Free Software to make the world a better place may still be here, regardless of what /sbin/init is.

[1] https://lists.debian.org/debian-project/2014/11/msg00002.html
[2] https://lists.debian.org/debian-devel/2014/11/msg00174.html
[3] A hint might be that in my more grumpy moments, I realize I haven't ever quite figured out why the heck this dbus thing is on so many of my systems, or why I have to edit XML to configure it ;-)
[4] #765870
[5] No disrespect meant to the init/systemd maintainers. Keep enjoying what you do, too!

8 February 2014

Sam Hartman: IETF Takes on Challenge of Internet Monitoring

At IETF 88, we held a plenary discussion of how we could harden the Internet against ongoing monitoring and surveillance. There were no significant surprises in what people said about monitoring. So, we had an opportunity to focus on what the IETF, as a standards organization responsible for technical standards that drive the Internet, can do about this problem. I think we made amazing progress.

The IETF works by consensus. We discuss issues, and see if there's some position that can gain a rough consensus of the participants in the discussion. After a couple of hours of discussion, Russ Housley asked several consensus questions. The sense of the room is that we should treat these incidents of monitoring as an attack and include them in the threats we try to counter with security features in our protocols. The room believes that encryption, even without authentication, has value in fighting these attacks. There was support for the idea that end-to-end encryption is valuable even when there are middle boxes. IETF decisions made in meetings are confirmed on public mailing lists, so the sense of the room is not final. Also, note that I did not capture the exact wording of the questions that were posed.

This is huge. There is very strong organizational agreement that we're going to take work in this space seriously. Now that we've decided pervasive monitoring is an attack, anyone can ask how a proposed protocol (or change to a protocol) counters that attack. If it doesn't handle the attack and there is a way to address the attack, then we will be in a stronger position arguing the threat could be addressed. In addition, the commitment to encryption providing value without authentication will be useful in providing privacy and minimizing fingerprinting by passive attackers.

The IETF is only one part of the solution. We can work on the standards that describe how systems interact. However, implementations, policy makers, operators and users will also play a role in securing the Internet against pervasive attacks.

Sam Hartman: Debian: Init, Interfaces, Our Users *and* Free Software

Recently, I've been watching the Debian Schadenfreude related to init systems. For those who do not know, Debian is trying to decide whether Systemd or Upstart will be used to start software on Debian. There are two other options, although I think Systemd and Upstart are the main contenders. Currently the Technical Committee is considering the issue, although there's some very real chance that the entire project will get dragged through this swamp with the general resolution process. This is one of those discussions that proves that Debian truly is a community: if we didn't have something really strong holding us together we'd never put up with something this painful. Between the accusations that our users now know the persecution European pagans faced at the hands of Christians as the Systemd priests drive all before them, a heated discussion of how the Debian voting system interacts with this issue, two failed calls for votes, a discussion of conflicts of interests, and a last-minute discussion of whether the matter had been appropriately brought before the Technical Committee (and if so, under what authority), there's certainly schadenfreude to go around if you're into that sort of thing. However, through all this, the Technical Committee has been doing great work understanding the issues; it has been a pleasure to watch their hard work from the sidelines.

I'd like to focus on one key question they've found: how tightly can other software depend on the init system? Each init system offers some nice features. Upstart has an event triggering model. Systemd manages login sessions and, at least in my opinion, has a really nice service file format. Can I take advantage of these in my software? If I do, then users of my software might need to use a particular init system.

Ian Jackson argues that we should give our users the choice of what init system to run. He reminds us that Debian is a community that supports diversity and diverse goals. We support multiple desktop systems, web browsers, etc. Wherever we can, we support people working on their goals and give developers the opportunity to make it work. When developers succeed, we give our users the opportunity to choose from among the options that have been developed. I think of Ian's argument as an appeal to part of our Social Contract. Clause 4 of the social contract begins:
Our priorities are our users and free software.
We will be guided by the needs of our users and the free software community. We will place their interests first in our priorities. We will support the needs of our users for operation in many different kinds of computing environments.
Ian is right that our users will be served by choice in init systems. However, this trade-off needs to be balanced against the needs of the free software community. Diversity is an important goal but it should not come at the price of stagnation and innovation. If I want to avoid using init scripts because they don't provide restart on failure, because they are hard to write correctly, and because they don't provide access to modern security isolation techniques, I should be able to do that. If Systemd service files provide a superior solution, I should be able to work toward using them. If the desire for init system diversity shuts down my ability to find like-minded people who want to take advantage of improvements in init systems and work towards that goal, then we've significantly damaged Debian as a forum for technical cooperation.

Early in my work as a Debian Developer, Anthony Towns taught me an interesting principle: those who do the work get significant influence over how and what gets done. Ian's right that we need to enable people to work towards choice in init systems. Those like-minded people need to be given a chance to find each other and pursue their goal. However, the cost of success should be on the shoulders of those who value choice in init systems. It should not come at the cost of preventing people from depending on improvements in init systems.

The best proposal I've seen for balancing this is to enumerate stable interfaces that people want to use. Things like the logind interface or, my favorite, the service file interface. It needs to be possible to make another implementation of the interface. The interface needs to be stable enough that a dedicated team could have a chance of catching up with an implementation. At some point during the release cycle such interfaces would need to be frozen. However, I don't think it is reasonable to mandate that there are multiple implementations of the interface, only that there could be. The point is to give people a chance to work towards diversity in init systems, not to force it down people's throats kicking and screaming until they leave the project or ignore our processes. Steve Langasek and Colin Watson seem to be working towards this. It's possible there's another approach besides interfaces.

My main concern is the same as Ian's: maintain Debian as a forum for people to pursue their goals and work together. I suspect we see the conflict in these goals differently. I hope that we as a project can explore this conflict and figure out where we have common ground. I commit to exploring how we can work towards init choice in my frameworks; I ask those who prioritize init choice to commit to exploring how we can take advantage of new features in their framework.

4 April 2013

Russ Allbery: Debian packaging of Git upstreams

Since there's a discussion of packaging software for Debian that uses Git upstream on Planet Debian right now, I wanted to weigh in and advocate for my current workflow for this situation, which I'm quite fond of. It's worth noting that I'm also upstream for quite a few of the packages I maintain, all in Git, and I use (almost) exactly the same structure for packaging my own software as for packaging anyone else's. So I have some experience with both sides of this.

First off, I completely agree with Joey: if upstream is already using Git, there's no reason not to base the Debian packaging on the upstream repository, and many, many reasons to do so. One of the biggest advantages is that when repositories share a common basis and have been regularly merged, you can easily cherry-pick commits, which is wonderful for security releases and situations where you need a quick bug fix from an unreleased upstream branch. I make very heavy use of this when packaging OpenAFS.

I do, however, like to base the Debian packaging on the released tarball, if for no other reason than that's the artifact that other people can more easily confirm. Yes, you can do the same thing with a Git tag, but the tarball is what upstream considers a release, so if one is available, I think it makes the most sense to base the packaging on it. I do this even for my own software. Thankfully, it's not that difficult to do both. Sam Hartman was the one who showed me this technique, and (after I used a manual script for some time for a couple of packages) Guido Günther incorporated the support into git-import-orig.

The key idea is to still import the tarball into the upstream branch, but instead of making that import a simple commit, you make it a merge commit referencing the upstream release tag or commit from their Git repository. This means that you still get the exact contents of the release tarball on the upstream branch (and pristine-tar works as normal), but that branch is also based on the full upstream line of development. Therefore, so is your packaging branch (master or what have you) since you merge upstream into it. You can then cherry-pick and take advantage of all of the normal Git features when following upstream development.

This is dead simple to do with git-import-orig. Just add the upstream repository as a remote for your Git repository, make sure it's up to date with git fetch and you have the upstream tags, and then pass the flag --upstream-vcs-tag <tag> to git-import-orig whenever importing the upstream release tarball. git-import-orig will handle the construction of the merge commit for you and everything will just work, exactly like it normally does with git-buildpackage except with a more complete history. This support was added in git-buildpackage 0.6.0~git20120324, so it's available in unstable and testing.

(I was going to update my notes on Debian packaging with Git to include this information before posting this, but I see that it will require some restructuring and quite a few changes to that document and I don't have time tonight. Hopefully soon.)

15 April 2012

Sam Hartman: Moonshot and RDSI

Moonshot continues to be busy. Lately we've been focusing on finishing our core technical specs, better understanding how Moonshot will be deployed, and working on our trust infrastructure. At the same time, we're beginning to watch organizations evaluate whether Moonshot addresses a need they have. I'm excited by this process because I like to see technology I work on adopted and because the feedback we get is very valuable. This week though, I personally get to participate in such an exercise. Tomorrow I'll be speaking at the Australian Research Data Storage Initiative's workshop on Moonshot. I'll be giving background on the project, talking about community success, and talking about how Moonshot can help Australia. I'm looking forward to that. I'm also very excited about a brainstorming exercise I'll be participating in today. Several key participants in the RDSI project and I will get together to carefully evaluate their needs and see what it would take for a Moonshot solution. I hope Moonshot does end up being a good fit. Regardless, I enjoy this sort of problem solving session and am happy to have the opportunity to sit down with knowledgeable people and see how we can solve real problems!

6 December 2011

Sam Hartman: Moonshot Introduction

I recently put together a reading list on Project Moonshot for a friend. If you have seen discussions of Moonshot but not known where to get started understanding the technology, here is a fairly good initial list. It's long, but take a look starting at the beginning and let us know what you think.

Take a look at http://www.project-moonshot.org/. Specifically, http://www.project-moonshot.org/sites/default/files/moonshot-feasibility-analysis.pdf and http://www.project-moonshot.org/sites/default/files/moonshot-briefing-ietf-78.pdf

That briefing paper contains outdated versions of the technical specifications. Please see http://tools.ietf.org/html/draft-ietf-abfab-arch-00 http://tools.ietf.org/html/draft-ietf-abfab-gss-eap and http://tools.ietf.org/html/draft-ietf-abfab-gss-eap-naming and http://tools.ietf.org/html/draft-ietf-abfab-aaa-saml

Oh, yeah, and for the totally cool stuff that is still being designed please see http://tools.ietf.org/html/draft-mrw-abfab-multihop-fed and http://tools.ietf.org/html/draft-mrw-abfab-trust-router

12 October 2011

Sam Hartman: Moonshot SSP

It's been a while since I've written about Moonshot. A lot has gone on; we've been too busy doing to be busy blogging. However, there's something that's happened recently that's so cool I had to take a moment to discuss it. Padl Software, the same people (well, person) who brought us LDAP support to replace NIS and the first Active Directory clone, has now produced a GSS-EAP Security Service Provider. That's software that implements the Moonshot protocol and plugs it into the standard Windows security infrastructure. This is neat because it allows you to use GSS-EAP with unmodified Windows applications like Internet Explorer and Outlook/Exchange.

Obviously, this will be great for Moonshot. However, I think the positive effects are more far-reaching than that. Luke has demonstrated that we can evolve the Windows security infrastructure without waiting for Microsoft to lead the way. For those of us working in the enterprise security space, that's huge. We can innovate and bring our innovation to Windows. In terms of getting acceptance in important user communities, getting funding for work, and making a practical difference, that's a big deal.

This code is still in the early stages. Padl has not decided how the code will be made available. We don't know if it will be under an open-source license yet. Luke, naturally, wants to get paid for his work. However, if this code does get released under an open-source license, it will be very valuable. That will give all of us who are looking for a starting point for security innovations a starting point for bringing our innovations to Windows.

Some in the open-source community will argue that we shouldn't work on improving Windows: if the open-source platforms have features Windows does not, then it may drive people to open-source. Especially for enterprise infrastructure, it tends not to work that way. You need broad cross-platform support to drive new technology. However, it does mean that we can take control of the evolution of our infrastructure; even for Windows there is no requirement that a single vendor controls what is possible.

2 September 2011

Asheesh Laroia: Debian bug squashing party at SIPB, MIT


(Photo credit: Obey Arthur Liu; originally on Picasa, license.) Three weekends ago, I participated in a Debian bug squashing party. It was more fun than I had guessed! The event worked: we squashed bugs. Geoffrey Thomas (geofft) organized it as an event for MIT's student computing group, SIPB. In this post, I'll review the good parts and the bad. I'll conclude with beaming photos of my two mentees and talk about the bugs they fixed. So, the good:

The event was a success, but as always, there are some things that could have gone more smoothly. Here's that list: Still, it turned out well! I did three NMUs, corresponding to three patches submitted for release-critical bugs by my two mentees. Those mentees were:

[Photo: Jessica enjoying herself] Jessica McKellar is a software engineer at Ksplice (now Oracle) and a recent graduate of MIT's EECS program. She solved three release-critical bugs. This was her first direct contribution to Debian. In particular: Jessica has since gotten involved in the Twisted project's personal package archive. Toward the end of the sprint, she explained, "I like fixing bugs. I will totally come to the next bug squashing party."

[Photo: Noah grinning] Noah Swartz is a recent graduate of Case Western Reserve University where he studied Mathematics and played Magic. He is an intern at the MIT Media Lab where he contributes to DoppelLab in Joe Paradiso's Responsive Environments group. This was definitely his first direct contribution to Debian. It was also one of the most intense command-line experiences he has had so far. Noah wasn't originally planning to come, but we were having lunch together before the hackathon, and I convinced him to join us. Noah fixed #625177, a fails-to-build-from-source (FTBFS) bug in nslint. The problem was that "-Wl" was written in all lowercase in the debian/rules file, as "-wl". Noah fixed that, making sure the package properly built in pbuilder, and then spent some quality time with lintian figuring out the right way to write a debian/changelog.

That's a wrap! We'll hopefully have one again in a few months, and before that, I hope to write up a guide so that we run things even more smoothly next time.

2 August 2011

Russ Allbery: Git Debian packaging notes revised

It's a new month, so as usual for the beginning of the month I'm hoping to be a bit better about regularly updating, and ideally with more than just software release announcements and book reviews. Sadly, today has not been a good day for concentration despite a good early start, but I did manage to finally get through a long-delayed revision of my page of notes on using Git for Debian packaging. The revision drops the idea of using a separate debian branch to track Debian-specific changes so that master could be a pure feature branch. While appealing for its conceptual purity, this ended up confusing people horribly and was extra work for no apparent benefit. There are also a variety of other revisions to bring things up to date and a few new minor tips. Unfortunately, I haven't yet had a chance to write up how to maintain the upstream branch as a merge between the upstream Git repository and the released tarball, an idea from Sam Hartman that I'm now using very successfully for the openafs package, but at least now there's a pointer to it. I'm not really sure what to do with the initial part of that page that offers an introduction to Git. I've now written about three of these in separate places, and I'm still unconvinced of how much they actually help. And there are a ton of tools and basic techniques that I use all the time in Git that aren't mentioned in there (like git pull --rebase --stat, which I aliased to git update, or the whole concept of aliases, or how this all plays with 3.0 source packages). I've left the basic instructions for now, but I think at some point I may cede the tutorial aspect to other sites and just collect neat tricks that aren't part of the basic workflow that everyone describes.

28 June 2011

Sam Hartman: Dream Plug

As part of trying to help with Freedom Box, I gained access to a Dream Plug. The Dream Plug is one of the leading possible platforms for Freedom Box. Computationally it's sexy for a low-power device. There's a 1200 MHz ARM with 512 MB of RAM. As a platform to enable people to create novel applications, it's great. It has multiple USB ports, two ethernets, a built-in 802.11B/G access point, bluetooth, audio, ESATA, Micro SD and full-sized SD. So, whether your application needs storage, networking, audio, or some interesting side device, you're covered. With the optional JTAG adapter it's even fairly friendly to developers: there are options for recovering from most failures and full console access. If you're looking at a reference platform that's still nominally embedded, but that allows you to play around thinking about what your application could do, this is a great option.

However, especially for Freedom Box, it's important to remember that the reference board a developer wants is not the same as a cost-reduced product on which to actually deploy something for people to buy. The Dream Plug is wrong for every actual deployment I've imagined. First, it's huge. If you happen to have free power plugs on your wall with lots of space above and below them, it might be an option. If, however, you're like everyone I know and have power strips, constrained spacing, etc., then the industrial engineering will disappoint at every turn. I was hoping for something that kind of looked like an Apple Airport Express. It's significantly larger than that in every dimension. Also, the plug is oriented the wrong way for minimizing the fraction of the power strip it takes up. The power supply part of the device can be detached, although even that is way too huge for a power strip, and the cable between the power supply and the computer is fairly short. You can reconfigure the device to not use its plug nature, running a cord from a normal power outlet to the power supply. But then if you have to detach the power supply for heat management reasons, you get a cord from the outlet to the power supply and another cord from the supply to the device. Add a few USB or ESATA devices and the octopus of cables begins to resemble some arcane mechanism. (So far, no elder gods have appeared though.)

The other issue is that you almost never want all the functionality. You pay for the bluetooth, audio and ESATA in terms of cost, heat and space regardless of whether you use them. I don't have a lot of applications that really take advantage of the full array of hardware.

The firmware update mechanism is decidedly not targeted at end-users. The version I obtained had a hacked copy of Debian lenny. No mechanism was provided for replacing the image in a safe manner that did not potentially require the optional highly-non-end-user-compatible JTAG board if something got interrupted. You could either unscrew the device and get to the micro SD card containing the image, or run software to replace the image from within Debian lenny. It's possible to configure the boot loader to run an update image off USB or SD until that succeeds, but doing that is also a non-end-user operation.

In conclusion, the name is perfect. This is exactly what hackers need to dream about the power of small computers everywhere. However, we must not forget that there's a step required to turn dreams into reality. Just as with any fully proprietary product, Freedom Box will require cost reduction steps, semi-custom boards and actual OEMs to truly be usable.
The claim in the previous sentence that the Freedom Box may have proprietary elements is disquieting to some. I think we can put together a software stack that is free with the possible exception of some device firmware. However, my suspicion is that anyone who turns that into a fully realized end-user product will add proprietary elements. I suspect some of the results of the cost reduction process such as resulting semi-custom boards will be proprietary. In many cases though, I suspect some proprietary software elements will be introduced.

24 June 2011

Raphaël Hertzog: People behind Debian: Sam Hartman, Kerberos package maintainer

Sam Hartman has been a Debian developer since 2000. He has never taken any sort of official role within Debian (that is, besides package maintainer), yet I know him for his very thoughtful contributions to discussions both on mailing lists and IRL during Debconf. Until I met him at Debconf, I didn't know that he was blind, and my first reaction was to be impressed because it must be some tremendous effort to read the volume of information that Debian generates on mailing lists. In truth he's at ease with his computer much like I am, although he uses it in a completely different way. Read on to learn more; my questions are in bold, the rest is by Sam.

Raphael: Who are you?

Sam: I'm Sam Hartman. I'm a 35-year-old software engineer. I am a principal consultant and co-owner of a small consulting company, called Painless Security. I started using Debian in the mid-1990s around the time of the bo release. I ended up deciding to join the project as a developer in 2000.

Raphael: You're blind and yet you're using your computer as effectively as I am. Can you explain to us how you set up your computer?

Sam: I gave a talk at Debconf9 on how my computer is set up; you can watch the video for full details. My main laptop runs Debian. I use the gnome-orca package as my primary screen reader. It speaks the Gnome desktop. It does a relatively good job of speaking Iceweasel/Firefox and Libreoffice. While it does speak gnome-terminal, it's not really good enough at speaking terminal programs for me to be comfortable using it. So, I run Emacs with the Emacspeak package. Within that, I run the Emacs terminal emulator, and within that, I tend to run Screen. For added fun, I often run additional instances of Emacs within the inner screens.

Raphael: Are there important problems in Debian in terms of accessibility to blind people?

Sam: KDE documentation talks a lot about accessibility, but at least for blind users, the code completely fails to deliver. That means there are a lot of good packages a blind user cannot use. The non-free Adobe Flash player has some accessibility, but it could be a lot better. The free alternatives have none. The free PDF readers have basically no accessibility. You can use pdftotext, but you cannot actually read a PDF in a graphical application. It's way too easy for a misbehaving program to lock up the entire accessibility infrastructure. Gnash is a big culprit here: if my Iceweasel starts Gnash, there's a good chance that either Iceweasel or the entire desktop will appear to hang from an accessibility standpoint. Other programs, including gksu, tend to fail in this way. Some of the more dynamic website features like pop-up menus or selection lists are really difficult to find and click without causing them to disappear. This gets better and worse over time as the accessibility support in Iceweasel changes and as websites change.

Raphael: What's your biggest achievement within Debian?

Sam: I decided to be a Debian developer because I thought that in 2000, Debian support for using enterprise security and infrastructure software was lacking. Back then, any software that included crypto functionality was segregated into a special non-us archive. Some of the software was missing; I started by packaging MIT Kerberos for Debian. Other software had security or enterprise features disabled in the packages. At the height of working on this, I was maintaining krb5, some SASL modules, PAM, some PAM modules, OpenAFS, and a version of SSH with Kerberos support.
I was also involved in the effort to resolve the legal issues so that we could move security software into the main archive. I think this work has been a huge success. In fact, it's been such a success that other people are now doing most of the work. I still maintain the krb5 package. When I started, I felt like I was pushing against the flow, trying to get people to add patches, sometimes even having to fork a package. Now, however, I can maintain just one component, and there are enough others who share my original goal that the work continues.

These days, I'm working on something that's an evolution of this enterprise security work: I'm packaging Project Moonshot for Debian and Ubuntu. Project Moonshot is a response to the wide variety of identity management systems such as OpenID, OAuth, SAML, and the like. Moonshot works to create an identity management approach that works well both for the web and for client-server and other applications. The project is a lot of fun, but the role of Debian in the project is also interesting. One of the things we want to show with Moonshot is how our technology can be effective when it is integrated throughout a platform. To really show the potential, it needs to work with as many applications in a given platform as possible. The open and collaborative nature of the Debian community makes it possible to introduce a new technology and have that technology evaluated on its merits. We don't need huge business relationships or marketing deals to integrate our technology: we need some combination of doing the work ourselves and showing others the benefits of working with us. For someone trying to do innovative work, the Debian model is powerful.

Raphael: If you could spend all your time on Debian, what would you work on?

Sam: I'd really love to work with the embedded Debian folks. The vision of a single source base that could be the stack for everything from small embedded devices all the way up to high-end servers is very appealing. Doing that with Debian involves a number of challenges. However, with the right people working full-time on meeting those challenges, I think we could offer something promising. I'd also love to have the time to contribute to project infrastructure: working on the release team, helping ftpmaster, that sort of thing. However, I don't. I'm just happy that so much of my consulting practice involves working on open-source software.

Raphael: What's the biggest problem of Debian?

Sam: There's something really not right about how we transition libraries from unstable to testing. Every time I get involved in a library transition, I'm shocked at how complicated it is and how disruptive it is, both to testing and to unstable. We need to look at technology and processes to break up the dependency snarls. For example, we don't have good archive tools for keeping old versions of libraries around to ease transitions. If you haven't thought about this issue, you'll probably say that I'm being overly picky and that this can't be the major problem for Debian. However, if you think about how much it impacts our ability to introduce things into unstable around the time of the freeze, or about how much it slows the release process, you may begin to appreciate how big an issue this is.

Raphael: Is there someone in Debian you admire for their contributions?

Sam: There are a number of people who have been role models for me over the years.
Anthony Towns really helped me understand a lot of what drives free-software projects and what needs to be true for positive motivation. Joey Hess showed us all that sometimes, social problems do have technical solutions. If the tools are so good that doing the right thing is far easier than any other course of action, quality improves.
Thank you to Sam for the time spent answering my questions. I hope you enjoyed reading his answers as much as I did. Subscribe to my newsletter to get my monthly summary of Debian/Ubuntu news and to make sure you don't miss further interviews. You can also follow along on Identi.ca, Twitter and Facebook.


15 March 2011

Sam Hartman: Moonshooting Jabber

Last fall, Moonshot was steaming forward. Then we ran into some non-technical obstacles, and progress on the implementation was disturbingly quiet from the end of October through February. That changed: the code was released February 25. Since then, the project has picked up the momentum of last fall. There's a new developers' corner with helpful links for participating in the project, obtaining the code, and preparing for our upcoming Second Moonshot Meeting. Standards work in the ABFAB working group has been making steady progress the entire time.

The Jabber chat room has been quite active. Developers have been working in three time zones; whenever I get up, there's likely to be interesting progress awaiting me in the chat logs and new things to work on. Today was no exception: Luke moonshooted Jabber. This is exciting; it's the first time our code has been used to authenticate a real application rather than a test service. Other discussion from the chat room not reflected in e-mail is equally exciting. He has Moonshot working with OpenSSH in controlled environments, although it appears to require some updates to the OpenSSH GSS-API support.

Now is a really great time to get involved in Moonshot. We hope to see you on our lists and in our chat. With last night's news, we need to think about eating our own dogfood: using Moonshot to authenticate to our own Jabber server and to our repository for commits. Right now, there are some security issues with the code (lack of EAP channel binding) that might make that undesirable. However, in a very small number of weeks or months I expect we will be there!

8 March 2011

Sam Hartman: V6 Really is that Hard

Sometimes I begin to think that we've solved most of the challenges to IPv6 deployment. Then something happens. This time it was a DAP-1522 access point: not a NAT, not a router, just a layer 2 device. A while after deploying it, I noticed that mail sometimes failed to work. After some debugging, the problem turned out to be that my laptop wasn't getting an IPv6 address. The router appeared to be sending out advertisements, and other machines on the same subnet were working fine, but this laptop had associated with the new access point. The access point's default configuration helpfully includes IGMP snooping. The IGMP snooping saw that no one had subscribed to any IPv4 multicast group corresponding to the router advertisements (the advertisements are IPv6 multicast, so no IGMP subscription would ever match) and thus didn't forward them to the wireless link. We have a long way to go if layer 2 devices sold today are incompatible with v6 in their default configurations.
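The original post doesn't include any code, but the symptom it describes is easy to check for. Below is a minimal diagnostic sketch of my own (not part of the post): a small Python script that waits for an IPv6 router advertisement on one interface. It assumes Linux and root privileges, and the interface name wlan0 is just a placeholder. If nothing arrives within the wait, a layer 2 device doing multicast snooping, as described above, is a plausible culprit. Existing tools such as tcpdump can give the same answer; the script only makes the check self-contained.

#!/usr/bin/env python3
# Wait for an IPv6 router advertisement on one interface (Linux, run as root).
# If none arrives, suspect a layer-2 device dropping IPv6 multicast.
import socket
import sys
import time

iface = sys.argv[1] if len(sys.argv) > 1 else "wlan0"   # placeholder interface name
ND_ROUTER_ADVERT = 134      # ICMPv6 type of a router advertisement
WAIT_SECONDS = 600          # unsolicited advertisements can be several minutes apart

# SO_BINDTODEVICE is 25 on Linux; fall back to the number if the constant is missing.
SO_BINDTODEVICE = getattr(socket, "SO_BINDTODEVICE", 25)

# A raw ICMPv6 socket on Linux delivers ICMPv6 messages without the IPv6 header,
# so the first byte of each received datagram is the ICMPv6 type.
sock = socket.socket(socket.AF_INET6, socket.SOCK_RAW, socket.IPPROTO_ICMPV6)
sock.setsockopt(socket.SOL_SOCKET, SO_BINDTODEVICE, iface.encode())

deadline = time.monotonic() + WAIT_SECONDS
while True:
    remaining = deadline - time.monotonic()
    if remaining <= 0:
        print(f"no router advertisement seen on {iface} in {WAIT_SECONDS}s; "
              "check multicast snooping on intermediate layer-2 devices")
        break
    sock.settimeout(remaining)
    try:
        data, addr = sock.recvfrom(2048)
    except socket.timeout:
        continue
    if data and data[0] == ND_ROUTER_ADVERT:
        print(f"router advertisement received from {addr[0]} on {iface}")
        break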
